Internally Driven Q-learning - Convergence and Generalization Results

Authors

  • Eduardo Alonso
  • Esther Mondragón
  • Niclas Kjäll-Ohlsson
Abstract

We present an approach to solving the reinforcement learning problem in which agents are provided with internal drives against which they evaluate the value of states according to a similarity function. We extend Q-learning by substituting internally driven values for ad hoc rewards. The resulting algorithm, Internally Driven Q-learning (IDQ-learning), is experimentally shown to converge to optimality and to generalize well. These results are preliminary yet encouraging: IDQ-learning is more psychologically plausible than Q-learning, and it devolves control, and thus autonomy, to agents that are otherwise at the mercy of the environment (i.e., of the designer).
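
The abstract gives no pseudocode, but the description suggests a Q-learning variant in which the external reward is replaced by an internally generated value computed from a similarity function. A minimal sketch, assuming a tabular setting with one feature vector per state and a Gaussian similarity function, might look as follows; the names similarity, idq_update, phi and drive are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def similarity(phi_s, drive, sigma=1.0):
    # Hypothetical similarity function: a Gaussian kernel between a state's
    # feature vector and the agent's internal drive vector.
    return float(np.exp(-np.sum((phi_s - drive) ** 2) / (2.0 * sigma ** 2)))

def idq_update(Q, phi, s, a, s_next, drive, alpha=0.1, gamma=0.95):
    # One IDQ-style backup: same form as the Watkins Q-learning update,
    # except that the ad hoc environmental reward is replaced by an
    # internally driven value, i.e. the similarity between the successor
    # state and the internal drive.
    #   Q     : (n_states, n_actions) table of action values
    #   phi   : (n_states, d) matrix of state feature vectors (assumed given)
    #   drive : (d,) internal drive vector the agent tries to approach
    internal_value = similarity(phi[s_next], drive)   # replaces the reward r
    td_target = internal_value + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```

Under these assumptions the only change to standard Q-learning is the source of the scalar that drives the temporal-difference error, which is what lets the agent, rather than the designer, define what counts as rewarding.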

Similar Articles

On the approximation by Chlodowsky type generalization of (p,q)-Bernstein operators

In the present article, we introduce Chlodowsky variant of $(p,q)$-Bernstein operators and compute the moments for these operators which are used in proving our main results. Further, we study some approximation properties of these new operators, which include the rate of convergence using usual modulus of continuity and also the rate of convergence when the function $f$ belongs to the class Li...


Two Novel Learning Algorithms for CMAC Neural Network Based on Changeable Learning Rate

The Cerebellar Model Articulation Controller (CMAC) Neural Network is a computational model of the cerebellum that acts as a lookup table. The advantages of CMAC are fast learning convergence and the capability of mapping nonlinear functions, due to its local generalization of weight updating, simple structure and easy processing. In the training phase, the disadvantage of some CMAC models is an unstable phenomenon...
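
The snippet above is from another paper, but the lookup-table behaviour it describes can be illustrated with a standard CMAC training step: the input activates a few cells, and the error is spread only over those cells, which is the local generalization mentioned above. The class name SimpleCMAC and its parameters are assumptions for illustration, not taken from that paper.

```python
import numpy as np

class SimpleCMAC:
    # Minimal CMAC-style lookup table over a scalar input in [0, 1).
    def __init__(self, n_tilings=8, cells_per_tiling=64, lr=0.1, seed=0):
        self.n_tilings = n_tilings
        self.cells = cells_per_tiling
        self.lr = lr
        self.weights = np.zeros((n_tilings, cells_per_tiling))
        # Each tiling is shifted by a random offset so activations overlap.
        self.offsets = np.random.default_rng(seed).uniform(0.0, 1.0, n_tilings)

    def _active_cells(self, x):
        # One activated cell per tiling for the input x.
        return [int((x * self.cells + off) % self.cells) for off in self.offsets]

    def predict(self, x):
        return sum(self.weights[t, c] for t, c in enumerate(self._active_cells(x)))

    def train(self, x, target):
        # Spread the prediction error locally over the activated cells only.
        error = target - self.predict(x)
        for t, c in enumerate(self._active_cells(x)):
            self.weights[t, c] += self.lr * error / self.n_tilings
```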


Connectionist Q-learning in Robot Control Task

The Q-learning algorithm suggested by Watkins in 1989 [1] belongs to a group of reinforcement learning algorithms. Reinforcement learning in robot control tasks takes the form of a multi-step procedure of adaptation. The main feature of this technique is that, in the process of learning, the system is not shown how to act in a specific situation; instead, learning develops by trial and error using re...
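
For reference, the tabular update rule introduced by Watkins, which the snippet above refers to and which IDQ-learning modifies, is

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right],$$

where $\alpha$ is the learning rate, $\gamma$ the discount factor, and $r_{t+1}$ the reward received after taking action $a_t$ in state $s_t$.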


A distinct numerical approach for the solution of some kind of initial value problem involving nonlinear q-fractional differential equations

Fractional calculus deals with the generalization of integration and differentiation of integer order to those of any order. A q-fractional differential equation usually describes a physical process imposed on the time scale set Tq. In this paper, we first propose a difference formula for discretizing the fractional q-derivative of Caputo type with a given order and scale index. We es...


Semi-online neural-Q-learning for real-time robot learning

Reinforcement Learning (RL) is a very suitable technique for robot learning, as it can learn in unknown environments and in real-time computation. The main difficulties in adapting classic RL algorithms to robotic systems are the generalization problem and the correct observation of the Markovian state. This paper attempts to solve the generalization problem by proposing the Semi-Online NeuralQ...
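
The generalization problem mentioned above is usually addressed by approximating Q-values with a parametric function instead of a table, so that nearby states share estimates. The sketch below uses a linear approximator for brevity (the cited paper proposes a specific Semi-Online Neural-Q architecture, which this does not reproduce); the function name approx_q_update and its parameters are illustrative assumptions.

```python
import numpy as np

def approx_q_update(W, phi_s, a, r, phi_next, alpha=0.01, gamma=0.95):
    # Semi-gradient Q-learning step for a linear approximator:
    # Q(s, a) is approximated as W[a] . phi(s), where W has shape
    # (n_actions, d) and phi(s) is a d-dimensional feature vector.
    # A neural network would replace the linear map in a neural-Q setup.
    q_sa = W[a] @ phi_s
    td_target = r + gamma * np.max(W @ phi_next)   # bootstrap from next state
    W[a] += alpha * (td_target - q_sa) * phi_s     # TD-error gradient step
    return W
```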



Publication date: 2012